A comparative study of divisive hierarchical clustering algorithms

Author

  • Maurice Roux
Abstract

A general scheme for divisive hierarchical clustering algorithms is proposed. It consists of three main steps: first, a splitting procedure that subdivides a cluster into two subclusters; second, a local evaluation of the bipartitions resulting from the tentative splits; and third, a formula for determining the node levels of the resulting dendrogram. A handful of such algorithms are given. These algorithms are compared using the Goodman-Kruskal correlation coefficient. As a global criterion, it is an internal goodness-of-fit measure that compares the order induced by the hierarchy with the order associated with the given dissimilarities. Applied to a hundred random data tables, these comparisons favor two methods based on unusual ratio-type formulas for the splitting procedure, namely the Silhouette criterion and Dunn's criterion. These two criteria take into account both the within-cluster and the between-cluster mean dissimilarity. In general, the results of these two algorithms are better than those of the classical agglomerative Average Link method.
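The Goodman-Kruskal coefficient mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard gamma definition, (concordant - discordant) / (concordant + discordant), computed over all pairs of object pairs, with ties ignored; the abstract does not state how the paper treats ties, so that choice is an assumption. The function name `goodman_kruskal_gamma` and the example matrices `d` (input dissimilarities) and `u` (node levels read off a hypothetical dendrogram) are illustrative and do not come from the paper.

```python
from itertools import combinations


def goodman_kruskal_gamma(d, u):
    """Goodman-Kruskal gamma between two symmetric dissimilarity matrices.

    d: given dissimilarities (list of lists).
    u: dissimilarities induced by a hierarchy, i.e. the level of the
       dendrogram node at which each pair of objects is first joined.
    Ties in either matrix are ignored (assumption, standard gamma).
    """
    n = len(d)
    pairs = list(combinations(range(n), 2))
    concordant = 0
    discordant = 0
    # Compare every pair of object pairs: do d and u order them the same way?
    for (i, j), (k, l) in combinations(pairs, 2):
        sign = (d[i][j] - d[k][l]) * (u[i][j] - u[k][l])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    total = concordant + discordant
    if total == 0:
        return 0.0  # every comparison is tied; gamma is undefined, return 0
    return (concordant - discordant) / total


# Illustrative data: four objects forming two tight groups {0, 1} and {2, 3}.
d = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]
# Hypothetical hierarchy: {0, 1} joined at level 1, {2, 3} at level 2,
# and the two groups joined at level 4.5.
u = [
    [0, 1, 4.5, 4.5],
    [1, 0, 4.5, 4.5],
    [4.5, 4.5, 0, 2],
    [4.5, 4.5, 2, 0],
]
print(goodman_kruskal_gamma(d, u))  # prints 1.0
```

In this example the hierarchy never reverses the order of two dissimilarities, so gamma reaches its maximum of 1; the six comparisons among the between-group pairs are tied in `u` and do not count.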


Similar resources

Hierarchical Clustering Algorithm - A Comparative Study

Clustering is a data mining (machine learning) technique used to place data elements into related groups without advance knowledge of the group definitions. In this paper the authors provide an in-depth explanation of the implementation of agglomerative and divisive clustering algorithms for various types of attributes. Database-the details of the victims of Tsunami in


On the performance of bisecting K-means and PDDP

problem is known as bisecting divisive clustering. Note that by recursively using a divisive bisecting clustering procedure, the dataset can be partitioned into any given number of clusters. Interestingly enough, the clusters so-obtained are structured as a hierarchical binary tree (or a binary taxonomy). This is the reason why the bisecting divisive approach is very attractive in many applicat...


Agglomerative and Divisive Approaches to Unsupervised Learning in Gestalt Clusters

Hierarchical clustering algorithms can be agglomerative or divisive, depending on how partitions are formed. Such algorithms have advantages mainly related to the desired level of granularity the partition should have. The work described in this paper approaches two hierarchical algorithms, one agglomerative (and three of its variants) and the other divisive, focusing on their performance in un...


Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories

In recent years, the tremendous and increasing growth of spatial trajectory data and the necessity of processing and extraction of useful information and meaningful patterns have led to the fact that many researchers have been attracted to the field of spatio-temporal trajectory clustering. The process and analysis of these trajectories have resulted in the extraction of useful information whic...


Approximation Bounds for Hierarchical Clustering: Average Linkage, Bisecting K-means, and Local Search

Hierarchical clustering is a data analysis method that has been used for decades. Despite its widespread use, the method has an underdeveloped analytical foundation. Having a well understood foundation would both support the currently used methods and help guide future improvements. The goal of this paper is to give an analytic framework to better understand observations seen in practice. This ...



Journal title:
  • CoRR

Volume: abs/1506.08977


Publication date: 2015